We explore spatial transcriptomics datasets (Xenium : Imaging based ; Visium : Sequencing based) using VoltRon. Load necessary prerequisites. VoltRon requires R > 4.4.

Visium Analysis

In the first section, we explore Mouse brain Visium dataset for two tissue sections, anterior and posterior. VoltRon allows multilayer data objects, allowing different tissue blocks, adjacent sections and multi-omics methods to be stored within a single object. Data is from Mouse Brain.

We load each experiment separately and merge into a single object.

Ant_Sec1 <- importVisium("../data/Mouse Brain/Sagittal_Anterior/Section1/",
                         sample_name = "Anterior1")
Pos_Sec1 <- importVisium("../data/Mouse Brain/Sagittal_Posterior/Section1/",
                         sample_name = "Posterior1")

MBrain_Sec <- merge(Ant_Sec1, Pos_Sec1)
## Merging metadata ...
## Merging blocks and layers ...
MBrain_Sec
## VoltRon Object 
## Anterior1: 
##   Layers: Section1 
## Posterior1: 
##   Layers: Section1 
## Assays: Visium(Main) 
## Features: RNA(Main)

Voltron data accessors are similar to Seurat. You have data slots for different assays which each contain meta+raw data.

The raw counts matrix represent RNA counts for each gene for each spatial barcode of the spot. Remember that Visium data quantifies data at a spot level, which may consist of multiple genes.

str(MBrain_Sec)
## Formal class 'VoltRon' [package "VoltRon"] with 6 slots
##   ..@ samples        :List of 2
##   .. ..$ Anterior1 :Formal class 'vrBlock' [package "VoltRon"] with 3 slots
##   .. .. .. ..@ layer    :List of 1
##   .. .. .. .. ..$ Section1:Formal class 'vrLayer' [package "VoltRon"] with 2 slots
##   .. .. .. .. .. .. ..@ assay       :List of 1
##   .. .. .. .. .. .. .. ..$ Visium:Formal class 'vrAssayV2' [package "VoltRon"] with 9 slots
##   .. .. .. .. .. .. .. .. .. ..@ data           :List of 2
##   .. .. .. .. .. .. .. .. .. .. ..$ RNA     : num [1:32285, 1:2695] 0 0 0 0 0 0 0 0 2 0 ...
##   .. .. .. .. .. .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:32285] "Xkr4" "Gm1992" "Gm19938" "Gm37381" ...
##   .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:2695] "AAACAAGTATCTCCCA-1_Assay1" "AAACACCAATAACTGC-1_Assay1" "AAACAGAGCGACTCCT-1_Assay1" "AAACAGCTTTCAGAAG-1_Assay1" ...
##   .. .. .. .. .. .. .. .. .. .. ..$ RNA_norm: num [1:32285, 1:2695] 0 0 0 0 0 0 0 0 2 0 ...
##   .. .. .. .. .. .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:32285] "Xkr4" "Gm1992" "Gm19938" "Gm37381" ...
##   .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:2695] "AAACAAGTATCTCCCA-1_Assay1" "AAACACCAATAACTGC-1_Assay1" "AAACAGAGCGACTCCT-1_Assay1" "AAACAGCTTTCAGAAG-1_Assay1" ...
##   .. .. .. .. .. .. .. .. .. ..@ featuredata    : list()
##   .. .. .. .. .. .. .. .. .. ..@ embeddings     : list()
##   .. .. .. .. .. .. .. .. .. ..@ image          :List of 1
##   .. .. .. .. .. .. .. .. .. .. ..$ main:Formal class 'vrSpatial' [package "VoltRon"] with 4 slots
##   .. .. .. .. .. .. .. .. .. .. .. .. ..@ coords      : num [1:2695, 1:3] 439 144 410 108 123 ...
##   .. .. .. .. .. .. .. .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:2695] "AAACAAGTATCTCCCA-1_Assay1" "AAACACCAATAACTGC-1_Assay1" "AAACAGAGCGACTCCT-1_Assay1" "AAACAGCTTTCAGAAG-1_Assay1" ...
##   .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:3] "x" "y" "z"
##   .. .. .. .. .. .. .. .. .. .. .. .. ..@ segments    : list()
##   .. .. .. .. .. .. .. .. .. .. .. .. ..@ image       :List of 1
##   .. .. .. .. .. .. .. .. .. .. .. .. .. ..$ H&E: 'bitmap' raw [1:3, 1:600, 1:600] b7 bb b9 b7 ...
##   .. .. .. .. .. .. .. .. .. .. .. .. ..@ main_channel: chr "H&E"
##   .. .. .. .. .. .. .. .. .. ..@ params         :List of 4
##   .. .. .. .. .. .. .. .. .. .. ..$ nearestpost.distance: num 10.3
##   .. .. .. .. .. .. .. .. .. .. ..$ spot.radius         : num 4.62
##   .. .. .. .. .. .. .. .. .. .. ..$ vis.spot.radius     : num 9.24
##   .. .. .. .. .. .. .. .. .. .. ..$ spot.type           : chr "circle"
##   .. .. .. .. .. .. .. .. .. ..@ type           : chr "spot"
##   .. .. .. .. .. .. .. .. .. ..@ name           : chr "Assay1"
##   .. .. .. .. .. .. .. .. .. ..@ main_image     : chr "main"
##   .. .. .. .. .. .. .. .. .. ..@ main_featureset: chr "RNA"
##   .. .. .. .. .. .. ..@ connectivity:Class 'igraph'  hidden list of 10
##   .. .. .. .. .. .. .. ..$ : num 2695
##   .. .. .. .. .. .. .. ..$ : logi FALSE
##   .. .. .. .. .. .. .. ..$ : num(0) 
##   .. .. .. .. .. .. .. ..$ : num(0) 
##   .. .. .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. .. .. ..$ :List of 4
##   .. .. .. .. .. .. .. .. ..$ : num [1:3] 1 0 1
##   .. .. .. .. .. .. .. .. ..$ : Named list()
##   .. .. .. .. .. .. .. .. ..$ :List of 1
##   .. .. .. .. .. .. .. .. .. ..$ name: chr [1:2695] "AAACAAGTATCTCCCA-1_Assay1" "AAACACCAATAACTGC-1_Assay1" "AAACAGAGCGACTCCT-1_Assay1" "AAACAGCTTTCAGAAG-1_Assay1" ...
##   .. .. .. .. .. .. .. .. ..$ : Named list()
##   .. .. .. .. .. .. .. ..$ :<environment: 0x7fb21ca41190> 
##   .. .. .. ..@ zlocation: Named num 0
##   .. .. .. .. ..- attr(*, "names")= chr "Section1"
##   .. .. .. ..@ adjacency: num [1, 1] 0
##   .. .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. .. ..$ : chr "Section1"
##   .. .. .. .. .. ..$ : chr "Section1"
##   .. ..$ Posterior1:Formal class 'vrBlock' [package "VoltRon"] with 3 slots
##   .. .. .. ..@ layer    :List of 1
##   .. .. .. .. ..$ Section1:Formal class 'vrLayer' [package "VoltRon"] with 2 slots
##   .. .. .. .. .. .. ..@ assay       :List of 1
##   .. .. .. .. .. .. .. ..$ Visium:Formal class 'vrAssayV2' [package "VoltRon"] with 9 slots
##   .. .. .. .. .. .. .. .. .. ..@ data           :List of 2
##   .. .. .. .. .. .. .. .. .. .. ..$ RNA     : num [1:31053, 1:3351] 0 0 0 0 0 0 0 0 0 0 ...
##   .. .. .. .. .. .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:31053] "Xkr4" "Gm1992" "Gm37381" "Rp1" ...
##   .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:3351] "AAACAAGTATCTCCCA-1_Assay2" "AAACACCAATAACTGC-1_Assay2" "AAACAGAGCGACTCCT-1_Assay2" "AAACAGCTTTCAGAAG-1_Assay2" ...
##   .. .. .. .. .. .. .. .. .. .. ..$ RNA_norm: num [1:31053, 1:3351] 0 0 0 0 0 0 0 0 0 0 ...
##   .. .. .. .. .. .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:31053] "Xkr4" "Gm1992" "Gm37381" "Rp1" ...
##   .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:3351] "AAACAAGTATCTCCCA-1_Assay2" "AAACACCAATAACTGC-1_Assay2" "AAACAGAGCGACTCCT-1_Assay2" "AAACAGCTTTCAGAAG-1_Assay2" ...
##   .. .. .. .. .. .. .. .. .. ..@ featuredata    : list()
##   .. .. .. .. .. .. .. .. .. ..@ embeddings     : list()
##   .. .. .. .. .. .. .. .. .. ..@ image          :List of 1
##   .. .. .. .. .. .. .. .. .. .. ..$ main:Formal class 'vrSpatial' [package "VoltRon"] with 4 slots
##   .. .. .. .. .. .. .. .. .. .. .. .. ..@ coords      : num [1:3351, 1:3] 437 141 408 106 120 ...
##   .. .. .. .. .. .. .. .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:3351] "AAACAAGTATCTCCCA-1_Assay2" "AAACACCAATAACTGC-1_Assay2" "AAACAGAGCGACTCCT-1_Assay2" "AAACAGCTTTCAGAAG-1_Assay2" ...
##   .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..$ : chr [1:3] "x" "y" "z"
##   .. .. .. .. .. .. .. .. .. .. .. .. ..@ segments    : list()
##   .. .. .. .. .. .. .. .. .. .. .. .. ..@ image       :List of 1
##   .. .. .. .. .. .. .. .. .. .. .. .. .. ..$ H&E: 'bitmap' raw [1:3, 1:600, 1:600] b9 bc ba b9 ...
##   .. .. .. .. .. .. .. .. .. .. .. .. ..@ main_channel: chr "H&E"
##   .. .. .. .. .. .. .. .. .. ..@ params         :List of 4
##   .. .. .. .. .. .. .. .. .. .. ..$ nearestpost.distance: num 10.3
##   .. .. .. .. .. .. .. .. .. .. ..$ spot.radius         : num 4.62
##   .. .. .. .. .. .. .. .. .. .. ..$ vis.spot.radius     : num 9.24
##   .. .. .. .. .. .. .. .. .. .. ..$ spot.type           : chr "circle"
##   .. .. .. .. .. .. .. .. .. ..@ type           : chr "spot"
##   .. .. .. .. .. .. .. .. .. ..@ name           : chr "Assay2"
##   .. .. .. .. .. .. .. .. .. ..@ main_image     : chr "main"
##   .. .. .. .. .. .. .. .. .. ..@ main_featureset: chr "RNA"
##   .. .. .. .. .. .. ..@ connectivity:Class 'igraph'  hidden list of 10
##   .. .. .. .. .. .. .. ..$ : num 3351
##   .. .. .. .. .. .. .. ..$ : logi FALSE
##   .. .. .. .. .. .. .. ..$ : num(0) 
##   .. .. .. .. .. .. .. ..$ : num(0) 
##   .. .. .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. .. .. ..$ :List of 4
##   .. .. .. .. .. .. .. .. ..$ : num [1:3] 1 0 1
##   .. .. .. .. .. .. .. .. ..$ : Named list()
##   .. .. .. .. .. .. .. .. ..$ :List of 1
##   .. .. .. .. .. .. .. .. .. ..$ name: chr [1:3351] "AAACAAGTATCTCCCA-1_Assay2" "AAACACCAATAACTGC-1_Assay2" "AAACAGAGCGACTCCT-1_Assay2" "AAACAGCTTTCAGAAG-1_Assay2" ...
##   .. .. .. .. .. .. .. .. ..$ : Named list()
##   .. .. .. .. .. .. .. ..$ :<environment: 0x7fb21cb777b0> 
##   .. .. .. ..@ zlocation: Named num 0
##   .. .. .. .. ..- attr(*, "names")= chr "Section1"
##   .. .. .. ..@ adjacency: num [1, 1] 0
##   .. .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. .. ..$ : chr "Section1"
##   .. .. .. .. .. ..$ : chr "Section1"
##   ..@ metadata       :Formal class 'vrMetadata' [package "VoltRon"] with 5 slots
##   .. .. ..@ molecule:Classes 'data.table' and 'data.frame':  0 obs. of  0 variables
##   .. .. .. ..- attr(*, ".internal.selfref")=<externalptr> 
##   .. .. ..@ cell    :'data.frame':   0 obs. of  0 variables
##   .. .. ..@ spot    :'data.frame':   6046 obs. of  6 variables:
##   .. .. .. ..$ id      : chr [1:6046] "AAACAAGTATCTCCCA-1_Assay1" "AAACACCAATAACTGC-1_Assay1" "AAACAGAGCGACTCCT-1_Assay1" "AAACAGCTTTCAGAAG-1_Assay1" ...
##   .. .. .. ..$ Count   : num [1:6046] 13991 39797 29951 42333 35700 ...
##   .. .. .. ..$ assay_id: chr [1:6046] "Assay1" "Assay1" "Assay1" "Assay1" ...
##   .. .. .. ..$ Assay   : chr [1:6046] "Visium" "Visium" "Visium" "Visium" ...
##   .. .. .. ..$ Layer   : chr [1:6046] "Section1" "Section1" "Section1" "Section1" ...
##   .. .. .. ..$ Sample  : chr [1:6046] "Anterior1" "Anterior1" "Anterior1" "Anterior1" ...
##   .. .. ..@ ROI     :'data.frame':   0 obs. of  0 variables
##   .. .. ..@ tile    :Classes 'data.table' and 'data.frame':  0 obs. of  0 variables
##   .. .. .. ..- attr(*, ".internal.selfref")=<externalptr> 
##   ..@ sample.metadata:'data.frame':  2 obs. of  3 variables:
##   .. ..$ Assay : chr [1:2] "Visium" "Visium"
##   .. ..$ Layer : chr [1:2] "Section1" "Section1"
##   .. ..$ Sample: chr [1:2] "Anterior1" "Posterior1"
##   ..@ graph          : list()
##   ..@ main.assay     : chr "Visium"
##   ..@ project        : chr "VoltRon"
SampleMetadata(MBrain_Sec)
##         Assay    Layer     Sample
## Assay1 Visium Section1  Anterior1
## Assay2 Visium Section1 Posterior1
MBrain_Sec[["Assay1"]]@data$RNA[1:10,1:10]  
##         AAACAAGTATCTCCCA-1_Assay1 AAACACCAATAACTGC-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           0                         1
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          2                         1
## Lypla1                          0                         1
##         AAACAGAGCGACTCCT-1_Assay1 AAACAGCTTTCAGAAG-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           0                         0
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          4                         0
## Lypla1                          2                         2
##         AAACAGGGTCTATATT-1_Assay1 AAACATGGTGAGAGGA-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           0                         2
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          2                         1
## Lypla1                          0                         0
##         AAACATTTCCCGGATT-1_Assay1 AAACCGGGTAGGTACC-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           0                         0
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          2                         3
## Lypla1                          1                         3
##         AAACCGTTCGTCCAGG-1_Assay1 AAACCTAAGCAGCCGG-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           1                         0
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          0                         6
## Lypla1                          1                         3

VoltRon workflow is very similar to Seurat’s single cell analysis workflow. Normalization -> Feature Selection -> PCA -> UMAP -> Clustering -> Vizualisation

Normalization is performed by logNorm Transform : Normalize for read depth and log transform. This is analogous to TPM transformation by DESeq2.

head(vrFeatures(MBrain_Sec)) ##gene names 
## [1] "Xkr4"    "Gm1992"  "Gm19938" "Gm37381" "Rp1"     "Sox17"
length(vrFeatures(MBrain_Sec))
## [1] 33502
# normalize and select features
MBrain_Sec <- normalizeData(MBrain_Sec)

TPM transformation is not suited for INTERSAMPLE comparisons. To calculate variable features, Variance Stabilizing Transformation (VST) is applied. logNorm Transformation is NOT a homoscedastic transformation .i.e., variance is not constant over the entire range of RNA mean count values. VST is more appropriate for comparison across samples as it corrects for this mean-dispersion relationship. This is important as a key assumption of linear regression (Ordinary Least Squares) assumes that this distribution is homoscedastic (see comment from DESEQ2 developer here https://support.bioconductor.org/p/103983/)

MBrain_Sec <- getFeatures(MBrain_Sec, n = 3000)

# selected features
head(vrFeatureData(MBrain_Sec))
##                 mean          var    adj_var  rank
## Xkr4    0.0248608534 0.0249941807 0.02800216 14114
## Gm1992  0.0000000000 0.0000000000 0.00000000     0
## Gm19938 0.0285714286 0.0322197476 0.03224908 13889
## Gm37381 0.0000000000 0.0000000000 0.00000000     0
## Rp1     0.0003710575 0.0003710575 0.00000000     0
## Sox17   0.1907235622 0.2219629135 0.23715920 10304
vrFeatureData(MBrain_Sec) %>% as.data.frame() %>% filter(rank!=0) %>% arrange(rank) %>% head(n=10)  ### top n variable features 
##              mean        var    adj_var rank
## Bc1     1033.5792 397525.510 393519.537    1
## mt-Co1   651.1748  84108.144 112094.705    2
## mt-Co3   631.3306  71441.495 103254.732    3
## mt-Atp6  570.9477  57449.132  79227.006    4
## mt-Co2   430.0456  35673.594  38212.835    5
## mt-Cytb  325.4349  18950.586  19196.591    6
## mt-Nd4   262.5848  13087.387  11549.235    7
## mt-Nd1   197.4783   7809.633   6075.308    8
## mt-Nd2   184.4805   6635.325   5240.770    9
## Fth1     118.1859   3095.668   2109.195   10
selected_features <- getVariableFeatures(MBrain_Sec)
head(selected_features, 20)  ### alternate approach to get the most variable features -> Should match with above approach
##  [1] "Bc1"     "mt-Co1"  "mt-Co3"  "mt-Atp6" "mt-Co2"  "mt-Cytb" "mt-Nd4" 
##  [8] "mt-Nd1"  "mt-Nd2"  "Fth1"    "Hbb-bs"  "Cst3"    "Gapdh"   "Tmsb4x" 
## [15] "Mbp"     "Rplp1"   "Ttr"     "Ppia"    "Ckb"     "mt-Nd3"

We have selected 3000 most variable features which will be used for PCA. We use 30 of the most informative Principal Components for UMAP (note that ideally you need to perform an elbow analysis (cumulative % variance explained vs number of PCs) to pick the ideal number).

# embedding
MBrain_Sec <- getPCA(MBrain_Sec, features = selected_features, dims = 30)
MBrain_Sec <- getUMAP(MBrain_Sec, dims = 1:30)
vrEmbeddingNames(MBrain_Sec)
## [1] "pca"  "umap"
vrEmbeddingPlot(MBrain_Sec, embedding = "umap")

We can observe that both sections cluster separately (batch effect) and within each section, there are variable number of clusters. We can now extract clusters using Leiden-based Graph Clustering (SNN: shared nearest neighbours), which works on the assumption that nodes that share a lot of neighbours must belong to the same neighbourhood.

MBrain_Sec <- getProfileNeighbors(MBrain_Sec, dims = 1:30, k = 10, method = "SNN")
vrGraphNames(MBrain_Sec)
## [1] "SNN"
# clustering
MBrain_Sec <- getClusters(MBrain_Sec, resolution = 0.5, label = "Clusters", graph = "SNN")

We can visualize these clusters of spots that have similar expression profile using their UMAP embeddings. We can then overlay these cluster annotations back on the Visium tissue section.

vrEmbeddingPlot(MBrain_Sec, embedding = "umap", group.by = "Clusters")

vrSpatialPlot(MBrain_Sec,group.by = "Clusters")

We can observe the presence of different spot neighbourhoods. OF NOTE, there is a strong batch effect where Posterior and Anterior spot types cluster separately. There may be many shared spots between the two sections but they can only be obtained by ALIGNMENT + INTEGRATION (similar to Seurat) which will be explored later.

So far, we have managed to cluster spots, but each spot can have multiple cells. To get which ‘kinds’ of cells are located within spots, you need to use a scRNAseq reference dataset to ‘deconvolve’ which kinds of cells are located within a given spot.

First we load a SeuratObject containing reference scRNAseq for similar tissue section from Allen Institute and then visualize the annotated single cell identities.

Class contains the large groupings while Subclass identity is annotated cell type.

allen_reference <- readRDS("../data/Mouse Brain/scRNA Mouse Brain/allen_cortex_analyzed_subset.rds")

Idents(allen_reference) <- "subclass"
gsubclass <- DimPlot(allen_reference, reduction = "umap", label = T) + NoLegend()
Idents(allen_reference) <- "class"
gclass <- DimPlot(allen_reference, reduction = "umap", label = T) + NoLegend()
gsubclass | gclass

Now, we use the tool spacexr to deconvolve the spot-level expression profile (RNA counts) of 33000 genes into proportions of the 22 cell types (notice the output features as well as data structure).

# (OPTIONAL) deconvolute spots, - TAKES A LOT OF TIME !
#MBrain_Sec <- getDeconvolution(MBrain_Sec, sc.object = allen_reference,
#                               sc.cluster = "subclass", max_cores = 2)

# Load the object with Deconvolved spots
MBrain_Sec <- readRDS("../data/Mouse Brain/MBrain_Sec_decon.rds")

MBrain_Sec[["Assay1"]]@data$RNA[1:10,1:10] #original datastructure
##         AAACAAGTATCTCCCA-1_Assay1 AAACACCAATAACTGC-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           0                         1
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          2                         1
## Lypla1                          0                         1
##         AAACAGAGCGACTCCT-1_Assay1 AAACAGCTTTCAGAAG-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           0                         0
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          4                         0
## Lypla1                          2                         2
##         AAACAGGGTCTATATT-1_Assay1 AAACATGGTGAGAGGA-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           0                         2
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          2                         1
## Lypla1                          0                         0
##         AAACATTTCCCGGATT-1_Assay1 AAACCGGGTAGGTACC-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           0                         0
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          2                         3
## Lypla1                          1                         3
##         AAACCGTTCGTCCAGG-1_Assay1 AAACCTAAGCAGCCGG-1_Assay1
## Xkr4                            0                         0
## Gm1992                          0                         0
## Gm19938                         0                         0
## Gm37381                         0                         0
## Rp1                             0                         0
## Sox17                           1                         0
## Gm37587                         0                         0
## Gm37323                         0                         0
## Mrpl15                          0                         6
## Lypla1                          1                         3
head(vrFeatures(MBrain_Sec))
## [1] "Xkr4"    "Gm1992"  "Gm19938" "Gm37381" "Rp1"     "Sox17"
vrMainFeatureType(MBrain_Sec) <- "Decon" ##change main feature to Decon
vrFeatures(MBrain_Sec)
##  [1] "Astro"      "Endo"       "L23 IT"     "L4"         "L5 IT"     
##  [6] "L5 PT"      "L6 CT"      "L6 IT"      "L6b"        "Lamp5"     
## [11] "Macrophage" "Meis2"      "NP"         "Oligo"      "Peri"      
## [16] "Pvalb"      "Serpinf1"   "SMC"        "Sncg"       "Sst"       
## [21] "Vip"        "VLMC"

Note that Deconvolved features also have a normalized version, which are proportion values modified by Center Log Transformation. Only this version can be used to perform UMAPs etc as standard proportion data violate assumptions regarding Euclidean distances which are used in clustering algorithms.

MBrain_Sec[["Assay1"]]@data$Decon_norm[,1:5] 
##            AAACAAGTATCTCCCA-1_Assay1 AAACACCAATAACTGC-1_Assay1
## Astro                   1.698754e-01              2.908419e-01
## Endo                    1.395982e-02              5.785393e-02
## L23 IT                  1.684346e-05              2.286099e-05
## L4                      8.626439e-04              8.144352e-02
## L5 IT                   1.684346e-05              2.286099e-05
## L5 PT                   9.690356e-03              2.286099e-05
## L6 CT                   2.421895e-02              2.286099e-05
## L6 IT                   1.684346e-05              1.251016e-02
## L6b                     7.548642e-03              1.489008e-01
## Lamp5                   1.684346e-05              4.466996e-02
## Macrophage              4.346777e-02              1.410800e-02
## Meis2                   2.266107e-02              7.894250e-02
## NP                      1.684346e-05              2.544591e-02
## Oligo                   3.538572e-01              2.059919e-02
## Peri                    3.202853e-03              1.540615e-02
## Pvalb                   3.270831e-01              5.841740e-02
## Serpinf1                2.880825e-03              2.286099e-05
## SMC                     1.281985e-02              2.685541e-02
## Sncg                    1.684346e-05              2.286099e-05
## Sst                     1.684346e-05              3.649810e-03
## Vip                     7.735332e-03              6.640011e-02
## VLMC                    1.822105e-05              5.381814e-02
##            AAACAGAGCGACTCCT-1_Assay1 AAACAGCTTTCAGAAG-1_Assay1
## Astro                   0.2029293654              1.326023e-01
## Endo                    0.0195660833              1.839709e-02
## L23 IT                  0.5151169742              2.664359e-05
## L4                      0.0853594035              1.832501e-02
## L5 IT                   0.0260719991              2.664359e-05
## L5 PT                   0.0000627498              4.072442e-02
## L6 CT                   0.0119014923              2.432275e-02
## L6 IT                   0.0001804881              1.543518e-03
## L6b                     0.0133915891              3.285837e-02
## Lamp5                   0.0013288224              4.604203e-02
## Macrophage              0.0239925418              4.077851e-02
## Meis2                   0.0000627498              4.740202e-01
## NP                      0.0000627498              3.335841e-03
## Oligo                   0.0191449142              9.315878e-02
## Peri                    0.0116787874              3.471556e-03
## Pvalb                   0.0215887193              3.883137e-02
## Serpinf1                0.0125397952              2.664359e-05
## SMC                     0.0022943027              1.754877e-03
## Sncg                    0.0141900712              2.664359e-05
## Sst                     0.0089490296              7.194713e-04
## Vip                     0.0000627498              2.051925e-02
## VLMC                    0.0095246221              8.488068e-03
##            AAACAGGGTCTATATT-1_Assay1
## Astro                   1.873607e-01
## Endo                    1.966593e-02
## L23 IT                  1.012480e-02
## L4                      1.781930e-05
## L5 IT                   1.781930e-05
## L5 PT                   9.934361e-03
## L6 CT                   2.458475e-02
## L6 IT                   1.481845e-02
## L6b                     1.781930e-05
## Lamp5                   2.414206e-05
## Macrophage              2.464287e-02
## Meis2                   4.898915e-01
## NP                      1.759034e-02
## Oligo                   6.936008e-02
## Peri                    3.688043e-03
## Pvalb                   3.888593e-02
## Serpinf1                2.422734e-05
## SMC                     7.311029e-04
## Sncg                    3.093749e-05
## Sst                     4.635876e-03
## Vip                     8.393463e-02
## VLMC                    1.781930e-05

Vizualize proportion of selected features across the tissue.

vrSpatialFeaturePlot(MBrain_Sec, features = c("L4", "L5 PT", "Oligo", "Vip"),
                     crop = TRUE, ncol = 2, alpha = 1, keep.scale = "all")

Normalize proportions and perform UMAP. Visium spots with similar proportions of cells will be clustered together !

vrMainFeatureType(MBrain_Sec) <- "Decon"
MBrain_Sec <- normalizeData(MBrain_Sec, method = "CLR")
MBrain_Sec <- getUMAP(MBrain_Sec, data.type = "norm", umap.key = "umap_niche")
vrEmbeddingPlot(MBrain_Sec, embedding = "umap_niche", group.by = "Sample")

Get clusters based on Shared Nearest Neighbours

# clustering
MBrain_Sec <- getProfileNeighbors(MBrain_Sec, data.type = "norm", method = "SNN", graph.key = "SNN_niche")
MBrain_Sec <- getClusters(MBrain_Sec, resolution = 0.4, graph = "SNN_niche", label = "Niche_Clusters")

# visualize clustering
g1 <- vrEmbeddingPlot(MBrain_Sec, embedding = "umap", group.by = "Sample")
g2 <- vrEmbeddingPlot(MBrain_Sec, embedding = "umap", group.by = "Niche_Clusters", label = TRUE)
g1 | g2

Vizualize UMAP clusters overlaid onto the tissue level spots. Some distant regions on the tissue may share a similar type of cellular niche !

# spatial clustering plot
vrSpatialPlot(MBrain_Sec, group.by = "Niche_Clusters", crop = TRUE, alpha = 1)

We can look at what cell types define certain cellular niches using a heatmap.

vrHeatmapPlot(MBrain_Sec, features = vrFeatures(MBrain_Sec), group.by = "Niche_Clusters",
              show_row_names = T, show_heatmap_legend = T)
## `use_raster` is automatically set to TRUE for a matrix with more than
## 2000 columns You can control `use_raster` argument by explicitly
## setting TRUE/FALSE to it.
## 
## Set `ht_opt$message = FALSE` to turn off this message.

Xenium Analysis

Unlike Visium datasets, Xenium datasets are very large as they store coordinates of each molecule. Further, as its imaging-based, the images need to be stored in a multi-resolution format (similar to HiC matrices). Voltron recommends treating Xenium data in on-disk (non-memory) formats. The storage format is HDF5 and treats data similar to BPCells for single-cell data.

Xen_R1 <- importXenium("../data/BreastCancer/Xenium_R1/outs/", 
                       sample_name = "XeniumR1", 
                       overwrite_resolution = TRUE, 
                       resolution_level = 3)
## Creating cell level assay ...
## Loading morphology_mip.ome.tif ...
##   Image Resolution (X:8854 Y:6444) ...
##   Increasing Contrast ...
##   Writing Tiff File ...
# Metadata
Xen_R1@metadata
## VoltRon Metadata Object 
## This object includes: 
##    167780 cells
head(Metadata(Xen_R1))
##                id Count assay_id  Assay    Layer   Sample
## 1_Assay1 1_Assay1    28   Assay1 Xenium Section1 XeniumR1
## 2_Assay1 2_Assay1    94   Assay1 Xenium Section1 XeniumR1
## 3_Assay1 3_Assay1     9   Assay1 Xenium Section1 XeniumR1
## 4_Assay1 4_Assay1    11   Assay1 Xenium Section1 XeniumR1
## 5_Assay1 5_Assay1    48   Assay1 Xenium Section1 XeniumR1
## 6_Assay1 6_Assay1     7   Assay1 Xenium Section1 XeniumR1
# filter out counts
Xen_R1 <- subset(Xen_R1, Count > 5)

Save the Xenium object ondisk. Now all operations will be performed and evaluated LAZILY (as opposed to eager execution).

#Not necessary to repeat the object save.

# Xen_R1_ondisk <- saveVoltRon(Xen_R1, 
#                              format = "HDF5VoltRon", 
#                              output = "./workshop/data/ondisk/Xen_R1", 
#                              replace = TRUE)

# load voltron from disk
Xen_R1_ondisk <- loadVoltRon("../data/ondisk/Xen_R1/")

Perform operations one cell-level counts. Note that cell level mRNA counts are generated using an additional segmentation reference channel (like cell boundary immunostatining)

# normalize
Xen_R1_ondisk <- normalizeData(Xen_R1_ondisk, sizefactor = 1000)

# PCA reduction
Xen_R1_ondisk <- getPCA(Xen_R1_ondisk, dims = 20, overwrite = TRUE)
Xen_R1_ondisk <- getUMAP(Xen_R1_ondisk, dims = 1:20, overwrite=TRUE)

vrEmbeddingPlot(Xen_R1_ondisk, embedding="umap")

Some diagnostic plots.

# neighbors
Xen_R1_ondisk <- getProfileNeighbors(Xen_R1_ondisk, dims = 1:20, method = "SNN")
vrGraphNames(Xen_R1_ondisk)
## [1] "SNN"
# clustering
Xen_R1_ondisk <- getClusters(Xen_R1_ondisk, resolution = 1.3, label = "Clusters", graph = "SNN")

# visualization
vrEmbeddingPlot(Xen_R1_ondisk, group.by = "Clusters", embedding = "umap", 
                pt.size = 0.4, label = TRUE)

# spatial plot
vrSpatialPlot(Xen_R1_ondisk, group.by = "Clusters", pt.size = 0.18)
## Warning: Converting to a dense matrix may use excessive memory
## This message is displayed once every 8 hours.
## Warning: Removed 689 rows containing non-finite outside the scale range
## (`stat_bin2d()`).
## Warning: Removed 450 rows containing missing values or values outside the scale range
## (`geom_tile()`).

Its possible to perform ‘Marker’ analyses like for single-cell RNAseq data. Voltron can transform objects into Seurat objects which can then be used with FindAllMarkers (which will identify cluster specific markers).

Xen_R1$Clusters <- Xen_R1_ondisk$Clusters
Xen_R1_seu <- VoltRon::as.Seurat(Xen_R1, cell.assay = "Xenium", type = "image")
## Warning: Data is of class matrix. Coercing to dgCMatrix.
Idents(Xen_R1_seu) <- "Clusters"
Xen_R1_seu <- NormalizeData(Xen_R1_seu, scale.factor = 1000)
## Normalizing layer: counts
markers <- FindAllMarkers(Xen_R1_seu)
## Calculating cluster 1
## Calculating cluster 2
## Calculating cluster 3
## Calculating cluster 4
## Calculating cluster 5
## Calculating cluster 8
## Calculating cluster 9
## Calculating cluster 12
## Calculating cluster 13
## Calculating cluster 10
## Calculating cluster 6
## Calculating cluster 11
## Calculating cluster 14
## Calculating cluster 15
## Calculating cluster 7
## Calculating cluster 16
## Calculating cluster 17
## Calculating cluster 18
head(markers,n=10)
##        p_val avg_log2FC pct.1 pct.2 p_val_adj cluster   gene
## LUM        0 -2.3262849 0.160 0.642         0       1    LUM
## TOMM7      0 -0.2859225 0.404 0.881         0       1  TOMM7
## ACTG2      0 -0.3697581 0.208 0.668         0       1  ACTG2
## RUNX1      0 -0.8051247 0.148 0.589         0       1  RUNX1
## POSTN      0 -2.2517031 0.144 0.577         0       1  POSTN
## ACTA2      0 -1.0456143 0.165 0.590         0       1  ACTA2
## VOPP1      0 -0.4154167 0.169 0.582         0       1  VOPP1
## CXCL12     0 -2.4820071 0.062 0.463         0       1 CXCL12
## DST        0 -0.3840689 0.228 0.627         0       1    DST
## CXCR4      0 -1.7840391 0.116 0.505         0       1  CXCR4

These markers can be used to classify cell types by an expert. Load predefined annotations and perform Niche Clustering Analysis.

In Visium, we deconvolve spots to get mixture of cells and then identify niches based on neighbouring spots that have same cell composition. In xenium, there are no spots, but rather we directly use the position of cells to identify neighbour cell types (and consequently define niches).

Xen_R1_ondisk <- getSpatialNeighbors(Xen_R1_ondisk, radius = 30, method = "radius")
vrGraphNames(Xen_R1_ondisk)
## [1] "SNN"    "radius"
# get niche assay
Xen_R1_ondisk <- getNicheAssay(Xen_R1_ondisk, label = "CellType", graph.type = "radius")
Xen_R1_ondisk
## VoltRon Object 
## XeniumR1: 
##   Layers: Section1 
## Assays: Xenium(Main) 
## Features: RNA(Main) Niche
# normalizing niche assay
vrMainFeatureType(Xen_R1_ondisk) <- "Niche"
Xen_R1_ondisk <- normalizeData(Xen_R1_ondisk, method = "CLR")

# clustering niches
Xen_R1_ondisk <- getClusters(Xen_R1_ondisk, nclus = 9, method = "kmeans", label = "niche_clusters")

# visualization
vrSpatialPlot(Xen_R1_ondisk, group.by = "niche_clusters", alpha = 1)
## Warning: Removed 689 rows containing non-finite outside the scale range
## (`stat_bin2d()`).
## Warning: Removed 448 rows containing missing values or values outside the scale range
## (`geom_tile()`).

library(ComplexHeatmap)
vrHeatmapPlot(Xen_R1_ondisk, features = vrFeatures(Xen_R1_ondisk), group.by = "niche_clusters")
## `use_raster` is automatically set to TRUE for a matrix with more than
## 2000 columns You can control `use_raster` argument by explicitly
## setting TRUE/FALSE to it.
## 
## Set `ht_opt$message = FALSE` to turn off this message.

# visualization of specific cell type
#vrSpatialPlot(Xen_R1_ondisk, group.by = "CellType", pt.size = 0.18, alpha = 1, group.ids = c("ACTA2_myoepithelial", "KRT15_myoepithelial"))
vrSpatialPlot(Xen_R1_ondisk, group.by = "CellType", pt.size = 1, alpha = 1, group.ids = c("CD4_TCells", "CD8_TCells", "BCells"), n.tile = 400)
## Warning: Removed 94 rows containing non-finite outside the scale range
## (`stat_bin2d()`).
## Warning: Removed 230 rows containing missing values or values outside the scale range
## (`geom_tile()`).

Hostpot analysis: Use Getis-Ord statistics to identify hotspots for a given feature. In other words, for a given feature, you compare local and global means to identify clusters of cells that have very high or very low values for the feature. Sample analyses for PGR gene expression. First you obtain a local cellular neighbourhood (cells contained within a ball of given radius) and perform hotspot analyses for PGR RNA counts.

# get spatial neighbor plot
Xen_R1_ondisk <- getSpatialNeighbors(Xen_R1_ondisk, method = "radius", radius = 15, graph.key = "radius_hot")

# visualize 
vrMainFeatureType(Xen_R1_ondisk) <- "RNA"
vrSpatialFeaturePlot(Xen_R1_ondisk, features = "PGR", alpha = 1, background.color = "black", n.tile = 300)

# analysis 
Xen_R1_ondisk <- getHotSpotAnalysis(Xen_R1_ondisk, features = "PGR", graph.type = "radius_hot", alpha.value = 0.001)
## Running Hot Spot Analysis with 'Getis-Ord' for 'Assay1'
# visualize
vrSpatialFeaturePlot(Xen_R1_ondisk, features = "PGR_hotspot_stat", alpha = 1, background.color = "black", n.tile = 400)

vrSpatialPlot(Xen_R1_ondisk, group.by = "PGR_hotspot_flag", alpha = 1, background.color = "black", n.tile = 400)

Multimodal Image Alignment/Registration: VoltRon allows you to align multimodal data (for example Immunofluoresence or H&E staining) from the same or adjacent tissue sections. This can be done either manually (user selects equivalent points between two images) or automatically. Note: The below code chunk requires use of shiny and runs into issue with RMd, so see reference screenshot. The query image is the H&E staining from a Visium experiment (left) while reference image is Xenium experiment from adjacent tissue. Better to downscale resolution as otherwise the alignment focuses (erroneously) onto small scale fine details. The bottom row shows connected points and overlay of query onto reference.

# HW_Xen_R1_ondisk <- loadVoltRon("../data/ondisk/Xen_R1/")
# HW_Vis <- importVisium("../data/BreastCancer/Visium/")
# 
# HW_reg <- registerSpatialData(object_list = list(HW_Vis, HW_Xen_R1_ondisk))

knitr::include_graphics(normalizePath("../Assignment_Registration.pdf"))

Now transformed coordinates will be available for the query image. These coordinates can be used to map back to the reference and viz them side-by-side. Same way, one can add manual ROIs onto Xenium images for ROI specific quantifications etc.